

# SPIRAL: Signal-Power Integrity Co-Analysis for High-Speed Inter-Chiplet Serial Links Validation

Xiao Dong<sup>1</sup>, Songyu Sun<sup>1</sup>, Yangfan Jiang<sup>1</sup>, Jingtong Hu<sup>2</sup>, Dawei Gao<sup>1,3</sup>, Cheng Zhuo<sup>1,4\*</sup>

<sup>1</sup>Zhejiang University, Hangzhou, China

<sup>2</sup>University of Pittsburgh, Pittsburgh, USA

<sup>3</sup>Zhejiang ICsprout Semiconductor Co., Ltd., Hangzhou, China

<sup>4</sup>Key Laboratory of Collaborative Sensing and Autonomous Unmanned Systems of Zhejiang Province, Hangzhou, China

\*Corresponding Email: czhuo@zju.edu.cn

Abstract—Chiplets have recently emerged as a promising way to achieve further performance improvements by breaking complex processors down into modular components that communicate through high-speed inter-chiplet serial links. However, the ever-growing on-package routing density and data rates of such serial links inevitably lead to more complex and more severe signal and power integrity issues than in a large monolithic chip. This creates a strong demand for efficient analysis and validation tools to support robust design. In this paper, we propose SPIRAL, a signal-power integrity co-analysis framework for validating high-speed inter-chiplet serial links. The framework first builds equivalent models for the links with a machine learning-based transmitter model and an impulse response-based model for the channel and receiver. Then, signal and power integrity are co-analyzed with a pulse response-based method using the equivalent models. Experimental results show that SPIRAL yields eye diagrams with 0.82-1.85% mean relative error while achieving an 18-44× speedup compared to a commercial SPICE simulator.

## I. INTRODUCTION

The increasing demand for high computing performance in emerging applications such as machine learning (ML), 5G mobile, automotive, and data centers has necessitated the development of energy-efficient and cost-effective systems. Achieving high computing speed and bandwidth requires integrating transistors at high density. Recently, the escalating cost of silicon manufacturing and the limitations of integrating components on a single chip have led to the development of chiplet-based high-density heterogeneous integration (HDHI). This approach involves mounting multiple chiplets from different technology nodes onto an interposer using ultra-high density 2.5D/3D heterogeneous integration, offering modularity, scalability, and technology partitioning features [1].

Chiplets communicate with each other through organic or silicon interposers [2]. As the operating frequency and data rates in chiplet-based systems increase to reduce latency and ensure the bit widths of interactive data packets, the interconnections require an *enormous amount of wiring resources* for parallel data exchange, leading to high wiring complexity. Additionally, to achieve power and thermal benefits, the power-supply levels of the chiplets are decreasing, resulting in reduced noise margins on both power and signal nets. Consequently, high-density interconnections that target energy-efficient designs face signal and power integrity challenges caused by the steep rise/fall edges of high-speed signals [3].

Fig. 1. Architecture example of the inter-chiplet serial links [4].

Fig. 1 illustrates an example of a single inter-chiplet serial link [4]. Given the complexity of the inter-chiplet link itself and the number of links specified in UCIe 1.0 [4], directly simulating such a system is extremely challenging and time-consuming [5]. On the one hand, the transmitters (TXs) can exhibit high nonlinearity and hence demand accurate modeling. Simple behavioral models fail to capture the signal deterioration or the coupling to power supply noise (PSN), while transistor-level SPICE simulation can be excessively time-consuming. On the other hand, achieving the desired bandwidth may require tens to hundreds of links [4]. The signaling system itself, encompassing the TX, equalizers, and channel parasitics, is already highly complex. The coupling between power and signal nets further complicates the system, making it challenging to accurately capture the electrical uncertainty caused by complex capacitive coupling. Furthermore, analyzing the power integrity (PI) and signal integrity (SI) behavior typically requires simulating thousands to tens of thousands of unit intervals (UIs), whereas even a few hundred UIs can take hours to days to simulate in SPICE [6]. For system-level validation, it is common to sacrifice some accuracy for simulation efficiency in order to check the long-term system behavior for various input vectors [7]. For example, [8] reported the simulation of $10^{13}$ UIs to validate the system. Such long simulations inevitably render the validation of complex systems infeasible [9]. Thus, there is a high demand for efficient signal-power integrity co-analysis for chiplet-based systems.

To address the challenges posed by nonlinearity, complexity, and simulation time, we present SPIRAL: a framework for <u>signal-power integrity co-analysis of high-speed inter-chiplet serial links</u>. SPIRAL enables efficient implementation of heterogeneous integration with high interconnection density. The framework takes into account signal deterioration from signaling equalization, I/O buffers, the coupling between the power distribution network (PDN) and signal nets, intersymbol interference (ISI), and crosstalk within the signal nets. By using the physical-level (PHY) model of inter-chiplet links and PDN noise, signal-power integrity can be co-analyzed to evaluate system performance.

The main contributions of this paper are summarized as follows:

- The proposed SPIRAL framework provides efficient and sufficiently accurate signal and power integrity analysis for inter-chiplet system validation, which enables designers to fix failures at an early design stage while reducing the risk of costly iterations and redesigns at later stages.
- By leveraging the pulse response as the fundamental unit and incorporating the time-invariant characteristics of the system, SPIRAL effectively and efficiently confines nonlinearity within the waveform of pulse response. The output signal is calculated using the pulse response based superposition, thereby significantly improving the efficiency of signal and power integrity co-analysis.
- ML-based models are developed to predict the behavior of mixed-signal modules in the link, accurately and efficiently capturing their nonlinearity and the coupling to PSN.

Experimental results demonstrate the effectiveness of SPIRAL in accurately modeling the nonlinear behavior of the TX buffer and channel response. Signal-power integrity co-analysis using SPIRAL yields eye diagrams with 0.82-1.85% mean relative error, while achieving significant speedup (18-44×) compared to a commercial SPICE [10].

## II. BACKGROUND

Many existing works in the literature rely on traditional S-parameters and eye diagram simulations to analyze the behavior of inter-chiplet serial links [11], [12], which incur large time and computational costs when dealing with complex chiplet-based systems. Eye diagram simulations in particular can be very time-consuming for complex high-speed interconnects. For example, simulating eye diagrams for a 3-channel interconnect with 1 million bits at a data rate of 1Gb/s can take up to 22 hours in SPICE [13]. Only a few works have proposed systematic frameworks that construct an integral signal-power integrity co-analysis flow covering TXs, channels, and RXs in high-speed chiplet-based systems [1], [3], [14]. The work in [15] introduced new metrics in the frequency domain to evaluate the SI of interconnect models but lacked the ability to perform fast analysis when various input signals were used. Another related work [16] presented a signal-power integrity and electromagnetic interference (EMI) co-analysis flow, but its methodology modeled the impact of the PDN on signal channels as compact linear circuits, which can compromise accuracy.

Indeed, the scalability issue is one of the major challenges in designing complex chiplet-based systems with high interconnect density. As the number of inter-chiplet links increases, the simulation complexity and coupling effects between wires become more significant, which leads to poor scalability of existing simulation methods. This not only increases the design and optimization time for circuit designers but also makes it difficult to explore more intuitive design architectures in complex chiplet-based systems.

To address the above challenges, a scalable integral signal-power integrity co-analysis flow is needed, which enables comprehensive evaluation of channel performance and accurate prediction of serial links under different inputs. Such a flow can provide circuit designers with intuitive guidance for designing complex chiplet-based systems in an easier and faster manner.

## III. PROPOSED FRAMEWORK

#### A. Overall Flow

A representative inter-chiplet serial link system is shown in Fig. 1. It typically consists of multiple single-ended, unidirectional, full-duplex data lanes, with 16 data lanes for a standard package and 64 for an advanced package. Each unidirectional datapath comprises a TX, a channel, and an RX. The input data for the TX is represented as a sequence of 0s and 1s denoted by $x_1x_2...x_m$, where $x_i \in \{0, 1\}$ for $i \in [1, m]$. To transmit the data over the channel, the sequence is converted into a trapezoidal waveform, represented by u(t). The UI of the signal is determined by the clock frequency and is simply half of the clock period. The converted signal is transmitted by the TX buffer, passes through the channel, and eventually arrives at the impedance-matched RX. For input data rates exceeding 24GT/s, a de-emphasis equalizer can be used to reduce the ISI from the TX. The PSN is often coupled to the TX output through the pull-up path of the TX circuitry, which can also affect the RX output.
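The conversion from a bit sequence to the trapezoidal waveform u(t) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the sample-based time grid, the idle-low start, and the 6 GHz clock are all assumptions made for the example.

```python
def unit_interval(clock_hz):
    """UI is half of the clock period, as stated in the text."""
    return 0.5 / clock_hz


def trapezoid_waveform(bits, samples_per_ui, rise_samples, v_high=1.0):
    """Sample u(t) for a 0/1 sequence with linear edges of rise_samples samples.

    Assumptions (not from the paper): the line idles low before the first
    bit, and 1 <= rise_samples <= samples_per_ui.
    """
    wave = []
    prev = 0
    for bit in bits:
        for k in range(samples_per_ui):
            frac = min(k / rise_samples, 1.0)  # progress along the edge
            wave.append(v_high * (prev + (bit - prev) * frac))
        prev = bit
    return wave


ui = unit_interval(6e9)  # hypothetical 6 GHz clock, i.e. a 12 GT/s lane
u = trapezoid_waveform([1, 0], samples_per_ui=4, rise_samples=2)
print(ui, u)
```

Working on an integer sample grid avoids floating-point drift when locating bit boundaries; a real flow would pick the grid from the simulator time step.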

Based on the above architecture, we propose a signal-power integrity co-analysis framework for high-speed inter-chiplet serial links validation named SPIRAL, as shown in Fig. 2. The TX is a critical and complex module within the link, and its impact on the overall signal and power integrity must be modeled comprehensively. To achieve this, we employ artificial neural networks (ANNs) to capture the nonlinear behavior, PSN coupling, and de-emphasis of the TX, which allow for a more accurate and detailed analysis of the TX's impact on the signal and power integrity of the inter-chiplet serial links.

Fig. 2. Flow of the proposed SPIRAL framework.

To model the channel and RX, we extract S-parameters from the channel-RX architecture and convert them into impulse responses in the time domain. By sending a pulse into the TX model and convolving the TX output with the impulse response, we can calculate the pulse response of the interconnect in advance and save it for later use. A pulse represents a trapezoidal wave with a period of 2 UIs and a duty cycle of 0.5, as shown in Fig. 2. The inter-chiplet serial link system itself is nonlinear but time-invariant. By considering the output in units of pulse response, the nonlinearity of the system can be confined within the pulse response. With input signals decomposed into a series of pulses, the output signal is equivalent to the superposition of corresponding pulse responses with certain delays. Thus, by fetching the previously saved pulse response, we can efficiently calculate the output signals, which involves a one-time effort of calculating the pulse response and greatly improves the efficiency of SI analysis.

To perform signal-power integrity co-analysis, we first predict the PSN coupled to the TX output using the trained PSN-coupled model and convolve it with the impulse response to obtain the PSN coupled to RX output. Then we superpose it on the original output signal without noise so as to obtain the final output coupling with PSN. By using this co-analysis framework, we can effectively analyze the signal and power integrity of high-speed inter-chiplet serial links.

#### B. TX Modeling

TX typically comprises a de-emphasis module, necessary for transmitting high-speed signals, and a buffer. A common de-emphasis module utilizes a two-tap finite impulse response (FIR) filter to reduce the low-frequency components and equalize the channel loss. Given the input data sequence of  $x_1x_2...x_m$ , the equalized data sequence  $x_1^dx_2^d...x_m^d$  can be calculated by

$$x_i^d = C_0 x_i + C_1 x_{i-1}, i = 1, ..., m$$
 (1)

Thus, the equalized data can be viewed as the convolution result of the input data and tap coefficients. The equalized waveform  $u^d(t)$  is obtained by linear interpolation of the equalized data.
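As a small worked instance of Eq. (1), the sketch below applies a two-tap FIR filter to a bit sequence. The tap values `c0 = 0.75` and `c1 = -0.25` and the convention that the bit before the first one is 0 are assumptions chosen for illustration; the paper does not specify them.

```python
def de_emphasis(bits, c0, c1):
    """Eq. (1): x_i^d = C0 * x_i + C1 * x_{i-1}.

    Assumption: the symbol preceding the first bit (x_0) is taken as 0.
    """
    equalized = []
    prev = 0
    for x in bits:
        equalized.append(c0 * x + c1 * prev)
        prev = x
    return equalized


# Hypothetical tap coefficients; a negative post-cursor tap lowers the
# level of repeated bits relative to transitions.
x_d = de_emphasis([0, 1, 1, 0], c0=0.75, c1=-0.25)
print(x_d)
```

With these taps the first 1 after a 0 is sent at 0.75, while the repeated 1 drops to 0.5, which is the low-frequency attenuation the text describes.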

To reduce the computational cost of high-frequency transient simulation, several studies have proposed using ML techniques to model the nonlinear behavior of buffers, such as [17]–[19]. However, previous studies focused only on fixed designs without considering the effects of output loads and PSN. Training a single ML model to capture all the complicated I/O behaviors from both the circuit and external PSN would result in a heavy model and large training overhead. To address this, we first separate the effects of PSN coupling, de-emphasis, and the intrinsic nonlinearity of a TX, and assume that their impacts can be superimposed later. We adopt ANNs to model the nonlinear behavior of the TX buffer and train three separate models for the ideal (to capture TX nonlinearity), de-emphasis (optional for low-speed signals), and PSN-coupled TX, respectively. It is worth noting that the buffer output not only depends on the input and its electrical properties but is also affected by the output loads. Therefore, we consider the channel and the RX in the modeling process and propose an equivalent circuit, as shown in Fig. 3. Since it is difficult for TX designers to assess the channel loss at the TX design stage, we connect an ideal lossless channel and a pull-up matched impedance in parallel with a load capacitance.



Fig. 3. Equivalent circuit for TX model.



Fig. 4. Architecture of ANN models for: (a) ideal model and de-emphasis model and (b) PSN-coupled model.

To predict the behavior of the buffer efficiently and accurately, we need to extract key features for training the model. Since the output signal changes with the loads, we first choose the load capacitance $C_L$, the characteristic impedance of the channel $Z_0$, and the pull-up voltage $V_p$ as features. Additionally, the tap coefficient $C_0$ is chosen as another feature for the de-emphasis model to capture the de-emphasis information. Due to the presence of capacitance in the buffer, the present output is determined by both present and past inputs. Thus, a vector of input signals $[u(t-\alpha), u(t-\alpha+1), ..., u(t)]$ is selected as a feature for the ideal model, and $[u^d(t-\beta), u^d(t-\beta+1), ..., u^d(t)]$ is chosen for the de-emphasis model. For the PSN-coupled model, though we only focus on the voltage drop of the supply power, the coupling between the pull-up and pull-down modules of the TX buffer also causes minor noise in the output when it is low. So we extract a vector of noisy supply voltages $[v(t-\gamma), v(t-\gamma+1), ..., v(t)]$ as well as input signals $[u(t-\gamma), u(t-\gamma+1), ..., u(t)]$ as features. The time delay factors $\alpha$, $\beta$, and $\gamma$ depend on the buffer design. The outputs of the ideal and de-emphasis models are the buffer output signals $y(t)$ and $y^{d}(t)$, respectively, while that of the PSN-coupled model is the difference between the noisy and ideal output signals, denoted as $d(t)$. The model outputs can be formulated as follows:

$$y(t) = f_i[u(t-\alpha), ..., u(t), C_L, Z_0, V_p]$$
 (2)

$$y^{d}(t) = f_{d}[u^{d}(t-\beta), ..., u^{d}(t), C_{L}, Z_{0}, V_{p}, C_{0}]$$
(3)

$$d(t) = f_p[v(t-\gamma), ..., v(t), u(t-\gamma), ..., u(t), C_L, Z_0, V_p]$$
 (4)

where $f_i$, $f_d$, and $f_p$ represent the ideal, de-emphasis, and PSN-coupled models, respectively.
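The input vectors of Eqs. (2)-(4) can be assembled as in the sketch below. The helper names and the padding rule for samples before the start of the record (repeating the first sample) are assumptions for illustration; the paper does not describe how the window is handled at the boundary.

```python
def window(signal, t, depth):
    """Return [signal[t-depth], ..., signal[t]].

    Assumption: indices before the start of the record reuse the first sample.
    """
    return [signal[max(k, 0)] for k in range(t - depth, t + 1)]


def ideal_features(u, t, alpha, c_l, z0, v_p):
    """Arguments of f_i in Eq. (2)."""
    return window(u, t, alpha) + [c_l, z0, v_p]


def de_emphasis_features(u_d, t, beta, c_l, z0, v_p, c0):
    """Arguments of f_d in Eq. (3), with the extra tap-coefficient feature."""
    return window(u_d, t, beta) + [c_l, z0, v_p, c0]


def psn_features(v, u, t, gamma, c_l, z0, v_p):
    """Arguments of f_p in Eq. (4): supply-voltage window, then input window."""
    return window(v, t, gamma) + window(u, t, gamma) + [c_l, z0, v_p]


u = [0.0, 0.5, 1.0, 1.0]
v = [0.8, 0.8, 0.75, 0.8]
print(ideal_features(u, t=3, alpha=2, c_l=0.1, z0=50.0, v_p=0.8))
print(psn_features(v, u, t=1, gamma=2, c_l=0.1, z0=50.0, v_p=0.8))
```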

The architecture of the ANNs used to model the nonlinear behavior of the TX buffer is shown in Fig. 4. Fig. 4(a) shows the architecture of the ideal and de-emphasis models, which consists of an input layer, three hidden layers, and an output layer. The only difference between the ideal and de-emphasis models is that the latter has an additional input feature. We design a novel architecture for the PSN-coupled model, as shown in Fig. 4(b). The vector of supply voltages and input signals is first sent to a hidden layer with $\gamma$ neurons. Then the output vector, together with the other features, is sent to three hidden layers and an output layer. The number of neurons in each of the last three hidden layers is 18 for all models. The activation function used in all hidden layers is the hyperbolic tangent (tanh), while the output layer has a linear activation function. We adopt transfer learning to accelerate the training. Once an ANN model has been trained on a TX design with good accuracy, it can serve as a pre-trained model to assist in the training of models on other TX designs.
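To make the shape of the network in Fig. 4(a) concrete, the following dependency-free sketch runs a forward pass through three tanh hidden layers of 18 neurons and a linear output. It is only an illustration: the weights are random and untrained, the initialization range is arbitrary, and the helper names are hypothetical. The paper's models are implemented in PyTorch, whose code is not shown in the source.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (placeholder for trained parameters)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_mlp(n_features, rng, hidden=18):
    """Input -> 18 -> 18 -> 18 -> 1, matching the layer sizes in the text."""
    sizes = [n_features, hidden, hidden, hidden, 1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def dense(x, weights, biases, activation=None):
    out = []
    for row, b in zip(weights, biases):
        s = sum(w * xi for w, xi in zip(row, x)) + b
        out.append(activation(s) if activation else s)
    return out


def forward(layers, x):
    """tanh on every hidden layer, linear activation on the output layer."""
    for weights, biases in layers[:-1]:
        x = dense(x, weights, biases, math.tanh)
    weights, biases = layers[-1]
    return dense(x, weights, biases)[0]


rng = random.Random(0)
features = [0.0, 0.5, 1.0, 0.1, 50.0, 0.8]  # e.g. a window of u plus C_L, Z_0, V_p
model = build_mlp(len(features), rng)
y = forward(model, features)
print(y)
```

In practice the features would be normalized before training, since values such as $Z_0$ and $C_L$ differ by orders of magnitude; that step is omitted here.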

#### C. Channel-RX Modeling

The output of the TX is then sent to the channel and RX. However, running transient simulations repeatedly with a commercial tool for every signal received can be time-consuming. To address this, we propose calculating the impulse response of the channel. By convolving the signal with the impulse response, we can obtain the output signal efficiently and accurately. A system with n links can be regarded as a 2n-port network, as shown in Fig. 5, where the ground wire is omitted. We extract the S-parameters of the channels with RXs to model the transfer relationship among the I/O ports. Next, we derive the transfer function $H_{ji}$ between input port i and output port j using the S-parameters:

$$\Gamma_{in} = S_{ii} + \frac{S_{ij}S_{ji}}{1 - S_{jj}}$$

$$H_{ji} = \frac{2S_{ji}}{(1 - S_{jj})(1 + \Gamma_{in})}$$
(5)

Then, $H_{ji}$ is converted to the impulse response $h_{ji}$ with the inverse fast Fourier transform (IFFT).
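Eq. (5) and the conversion to the time domain can be illustrated with the sketch below. For brevity it uses a naive inverse DFT, which yields the same values as an IFFT but is slower, and it assumes the spectrum is supplied on a full two-sided, conjugate-symmetric frequency grid so that the real part can be taken. The flat S-parameter values are made up for the example.

```python
import cmath


def transfer_function(s_ii, s_ij, s_ji, s_jj):
    """Eq. (5): H_ji from the S-parameters at one frequency point."""
    gamma_in = s_ii + s_ij * s_ji / (1 - s_jj)
    return 2 * s_ji / ((1 - s_jj) * (1 + gamma_in))


def inverse_dft(spectrum):
    """Naive O(N^2) inverse DFT; an IFFT computes the same samples.

    Assumption: the spectrum is conjugate-symmetric, so the result is real.
    """
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)).real / n
        for m in range(n)
    ]


# Hypothetical frequency-flat, matched, lossy channel: S_ii = S_jj = 0, S_ij = S_ji = 0.5.
n_freq = 8
H = [transfer_function(0.0, 0.5, 0.5, 0.0) for _ in range(n_freq)]
h = inverse_dft(H)
print(h)
```

A flat spectrum maps to a single scaled sample at time zero, which is a quick sanity check before feeding in measured or extracted S-parameters.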



Fig. 5. Schematic of a 2n-port network with ground wires omitted.

For the system with n links, we need to calculate $n \times n$ impulse responses to describe the whole system response, including the insertion loss response and far-end crosstalk (FEXT) response. Then, the output signal $s_k^{out}$ of port k can be calculated by convolving the input signals of all input ports with the corresponding impulse responses and summing them up as follows:

$$s_k^{out} = \sum_{i=1}^n s_{2i-1}^{in} * h_{k,2i-1}$$
 (6)

where  $s_{2i-1}^{in}$  is the input signal of port 2i-1 and  $h_{k,2i-1}$  is the impulse response between input port 2i-1 and output port k.
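Eq. (6) amounts to one discrete convolution per input port followed by a sum. The sketch below shows this for a 4-port (two-link) system with output port 2; the dictionary layout keyed by input port number and the sample values are assumptions for illustration.

```python
def convolve(signal, kernel):
    """Full discrete convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(kernel):
            out[i + j] += s * h
    return out


def port_output(inputs, impulse_responses):
    """Eq. (6): sum over input ports of s_in * h for one output port k.

    Both arguments are dicts keyed by input port number (1, 3, ..., 2n-1).
    Assumption: all inputs share one length, and so do all impulse responses.
    """
    total = None
    for port, s_in in inputs.items():
        contribution = convolve(s_in, impulse_responses[port])
        if total is None:
            total = contribution
        else:
            total = [a + b for a, b in zip(total, contribution)]
    return total


# Output port 2: port 1 is the through path, port 3 contributes FEXT.
inputs = {1: [1.0, 0.0, 0.0], 3: [0.0, 1.0, 0.0]}
h_to_port2 = {1: [0.5, 0.25], 3: [0.125, 0.0]}
s_out = port_output(inputs, h_to_port2)
print(s_out)
```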

#### D. Signal-Power Integrity Co-analysis

With the proposed TX and channel-RX models, the output signal of any input data can be obtained by convolving the TX output and impulse response of the channel and RX. However, it may still take a long time for ANN inference and convolution when dealing with long input data. Thus, we propose a more efficient method to calculate the output signal. Though the serial link system is nonlinear due to the existence of the TX buffer, it is time-invariant because when the input pulse shifts, the shape of the output waveform remains unchanged but only has the same time shift as the input pulse. Then, if we consider the input signal as a superposition of pulses with different time shifts, the nonlinearity of the system can be confined within the pulse response, and the output signal can be obtained by superposing the pulse responses of both insertion loss and FEXT with corresponding time shifts. Considering the delay and high-frequency effects in serial links, the pulse response can be longer than the input pulse [20]. For a 2n-port system with an input data length of m, the output signal  $z_k(t)$  of port k can be calculated by:

$$z_k(t) = \sum_{i=1}^n \sum_{\tau=1}^m r_{k,2i-1}(t - \tau \cdot UI) \cdot x_{\tau,2i-1}$$
 (7)

where  $x_{\tau,2i-1}$  represents the  $\tau$ th input data (0 or 1) of input port 2i-1 and  $r_{k,2i-1}(t)$  is the pulse response between input port 2i-1 and output port k. Fig. 6 shows an example of calculating the output signal of port 2 on a 4-port serial link system. By using this approach, the output signal can be efficiently calculated, avoiding the need for repeated ANN inference and convolution for each input sample. This significantly reduces the computational time and enables efficient SI analysis of long input data in the serial link system. In addition, if the serial link system is highly nonlinear, we can expand the pulse and pulse response to multiple bits to improve accuracy.
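The superposition in Eq. (7) can be sketched as a shift-and-add over precomputed pulse responses. The code below works on a sample grid and indexes bits from zero, which only moves the time origin relative to Eq. (7); the pulse-response values and the data layout are made up for the example.

```python
def superpose(bits_by_port, pulse_by_port, samples_per_ui):
    """Eq. (7): add a shifted copy of the pulse response for every 1 bit.

    bits_by_port and pulse_by_port are dicts keyed by input port number.
    The pulse response may be longer than one UI, so copies overlap (ISI).
    """
    m = max(len(bits) for bits in bits_by_port.values())
    longest = max(len(r) for r in pulse_by_port.values())
    z = [0.0] * (longest + (m - 1) * samples_per_ui)
    for port, bits in bits_by_port.items():
        response = pulse_by_port[port]
        for tau, x in enumerate(bits):
            if x:  # a 0 bit contributes nothing
                offset = tau * samples_per_ui
                for n, value in enumerate(response):
                    z[offset + n] += value
    return z


# Output port 2 of a 4-port system: port 1 is the victim lane, port 3 the aggressor.
bits = {1: [1, 0, 1], 3: [0, 1, 0]}
pulses = {1: [0.5, 1.0, 0.25], 3: [0.125, 0.0, 0.0]}
z2 = superpose(bits, pulses, samples_per_ui=2)
print(z2)
```

The cost per bit is one vector addition, independent of how expensive the ANN inference and convolution were when the pulse response was first computed.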



Fig. 6. Example of output signal calculation with the proposed SPIRAL on a 4-port serial link system.

Signal-power integrity co-analysis can be achieved given the PSN. The supply voltage, obtained by subtracting the PSN from the ideal voltage, is first sent to the trained PSN-coupled model together with the other features to obtain the noise coupled to the TX output. This noise is then convolved with the impulse response of the channel and RX to obtain the output noise. By superposing the output noise onto $z(t)$, the effect of PSN on the output signal can be analyzed.
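The three steps above can be sketched as follows. The trained PSN-coupled ANN is replaced here by a hypothetical linear stand-in (`toy_noise_model`) purely so that the example runs; the real model is nonlinear and also takes the input-signal window and load features.

```python
def convolve(signal, kernel):
    """Full discrete convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(kernel):
            out[i + j] += s * h
    return out


def co_analyze(z, v_ideal, psn, tx_noise_model, h):
    """Superpose PSN-induced output noise onto the noise-free output z(t)."""
    supply = [v_ideal - n for n in psn]   # noisy supply voltage v(t)
    d = tx_noise_model(supply)            # noise coupled to the TX output, d(t)
    noise_out = convolve(d, h)            # noise seen at the RX output
    return [zi + noise_out[i] if i < len(noise_out) else zi for i, zi in enumerate(z)]


V_IDEAL = 1.0


def toy_noise_model(supply):
    """Hypothetical stand-in for the trained PSN-coupled model f_p."""
    return [0.5 * (s - V_IDEAL) for s in supply]


z = [1.0, 1.0, 1.0, 1.0]        # noise-free output from the pulse superposition
psn = [0.0, 0.25, 0.0, 0.0]     # one supply droop sample
z_noisy = co_analyze(z, V_IDEAL, psn, toy_noise_model, h=[0.5, 0.25])
print(z_noisy)
```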

## IV. EXPERIMENTAL RESULTS

#### A. TX Model Evaluation

| TX | Input amplitude | Input transition time | $C_L$ (pF) | $Z_0$ ($\Omega$) | $V_p$ (V) |
|----|-----------------|-----------------------|------------|------------------|-----------|
| B1 | 1.0-1.8V        | 0.08-0.4ns            | 10-60      | 50-70            | 0.5-1.0   |
| B2 | 1.0-1.5V        | 1.6-5.6ps             | 0.01-0.5   | 40-70            | 0.4-0.8   |

We evaluate the performance of the proposed TX modeling on two different buffers. To train the TX models, we generate the dataset by sweeping the parameters and running simulations with a commercial SPICE [10]. The parameters and their ranges of interest are shown in Table I. For each data sample, we randomly select a group of parameters and perform transient simulations to obtain the simulated output signal as the ground truth. Additionally, PSN waveforms with various frequencies are generated to train the PSN-coupled model, with the worst-case noise set to 35% of the supply voltage. The ideal, de-emphasis, and PSN-coupled models for the two buffers are trained using the generated training set and validated on a test set. The models are implemented in Python using PyTorch.

TABLE II COMPARISON OF ACCURACY AND RUNTIME BETWEEN THE PROPOSED TX MODEL AND SPICE

| TX | Model       | Mean AE/RE   | Max AE/RE     | Speedup |
|----|-------------|--------------|---------------|---------|
| B1 | Ideal       | 8.9mV/0.67%  | 93.2mV/6.97%  | 20×     |
|    | De-emphasis | 11.3mV/0.81% | 83.8mV/5.94%  | 18×     |
|    | PSN-coupled | 5.0mV/0.38%  | 62.8mV/4.62%  | 19×     |
| B2 | Ideal       | 11.5mV/0.89% | 77.6mV/5.95%  | 27×     |
|    | De-emphasis | 19.3mV/1.47% | 104.0mV/7.84% | 18×     |
|    | PSN-coupled | 8.0mV/0.65%  | 92.0mV/7.39%  | 21×     |

The performance of the proposed models compared to the commercial tool is summarized in Table II. Accuracy is evaluated using the mean absolute error (AE), mean relative error (RE), maximum AE, and maximum RE, where RE is the ratio of AE to the amplitude of the input signal. As shown in the table, the proposed TX models achieve mean RE values ranging from 0.38% to 1.47%, with maximum RE values below 8%. It is worth noting that for the de-emphasis model of B2, although there are a few sample points with larger RE (maximum RE of 7.84%), 96% of the sample points have RE values below 5%. Additionally, the proposed TX models run 18-27× faster than the commercial SPICE, which enables efficient simulation of the TX. Fig. 7 compares the predicted output waveforms with the ground truth, showing a close fit between the predictions and the simulation results.
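The error metrics reported in Table II follow directly from the definitions above. The sketch below computes them for a pair of short waveforms; the sample values and the function name are illustrative only.

```python
def error_metrics(predicted, reference, amplitude):
    """Mean/max absolute error and the same errors relative to the input amplitude."""
    abs_err = [abs(p - r) for p, r in zip(predicted, reference)]
    mean_ae = sum(abs_err) / len(abs_err)
    max_ae = max(abs_err)
    return {
        "mean_ae": mean_ae,
        "max_ae": max_ae,
        "mean_re": mean_ae / amplitude,
        "max_re": max_ae / amplitude,
    }


predicted = [0.0, 0.5, 1.0, 1.0]   # model output (volts)
reference = [0.0, 0.5, 0.75, 1.0]  # SPICE ground truth (volts)
metrics = error_metrics(predicted, reference, amplitude=2.0)
print(metrics)
```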



Fig. 7. The comparison between the simulation results and the predicted output waveforms from: (a) ideal model for B1; (b) ideal model for B2; (c) de-emphasis model for B1; (d) de-emphasis model for B2; (e) PSN-coupled model for B1; and (f) PSN-coupled model for B2.

#### B. Channel-RX Model Evaluation

The performance of the proposed channel-RX model is compared to the commercial SPICE on a two-link network. Two different input signals with an amplitude of 1V are directly sent to the channels, and the RX output signals calculated by the proposed model are compared with the simulation results from SPICE. Fig. 8 showcases the close agreement between the proposed model and SPICE simulation with mean RE less than 2%, validating the accuracy and effectiveness of the proposed channel-RX model.



Fig. 8. The comparison of the output voltage waveform obtained from SPICE simulation and the proposed channel-RX model on a two-link system: (a) link 1; and (b) link 2.

#### C. Signal-Power Integrity Co-analysis

The eye diagrams generated by SPIRAL are compared with simulation results from the commercial SPICE to validate the accuracy of SPIRAL for SI analysis without PSN and for signal-power integrity co-analysis. Eye diagrams are generated for a range of serial link designs with a data rate of 12GT/s, in which B2 is adopted as the TX. Input data with a length of 1000 UIs is randomly generated with various amplitudes and transition times. For each sample of input data, the eye diagrams obtained by SPIRAL and the commercial SPICE are compared by calculating the AEs and REs of the amplitude, eye height, and eye width, which are the key parameters of an eye diagram. Here, RE represents the ratio of the AE to the SPICE-simulated value. The statistical results are shown in Table III. The proposed SPIRAL achieves high accuracy with 0.82-1.85% mean RE and 2.55-6.67% maximum RE. Fig. 9 shows the eye diagrams obtained by SPIRAL compared to the SPICE simulation results. Fig. 9(a) and 9(c) are the eye diagrams obtained by SPIRAL for SI analysis and SI-PI co-analysis, respectively, while Fig. 9(b) and 9(d) are the corresponding SPICE simulation results. The eye diagrams generated by SPIRAL are highly similar to the simulation results.

TABLE III EYE DIAGRAM PARAMETERS FROM SPIRAL COMPARED TO SPICE

| Parameter  | Metric     | SI analysis  | SI-PI co-analysis |
|------------|------------|--------------|-------------------|
| Amplitude  | Mean AE/RE | 11.5mV/1.00% | 12.2mV/1.23%      |
|            | Max AE/RE  | 29.5mV/2.55% | 28.6mV/3.17%      |
| Eye height | Mean AE/RE | 13.1mV/1.23% | 14.0mV/1.85%      |
|            | Max AE/RE  | 54.1mV/4.44% | 36.1mV/6.67%      |
| Eye width  | Mean AE/RE | 0.64ps/0.82% | 0.93ps/1.38%      |
|            | Max AE/RE  | 3.60ps/4.71% | 2.99ps/4.67%      |

Fig. 9. Eye diagrams of: (a) SI analysis result from SPIRAL; (b) SI analysis result from SPICE; (c) signal-power integrity co-analysis result from SPIRAL; and (d) signal-power integrity co-analysis result from SPICE.
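One simple way to extract an eye height from an output waveform is sketched below. It uses a simplified definition (the gap between the lowest "high" sample and the highest "low" sample at a single sampling phase) that is an assumption of this example; the paper does not state how its eye parameters are measured, and commercial tools typically use statistical definitions.

```python
def eye_height(wave, samples_per_ui, phase, threshold):
    """Vertical eye opening at one sampling phase within the UI.

    Simplified definition assumed here: min(high samples) - max(low samples).
    Requires at least one sample on each side of the threshold.
    """
    samples = wave[phase::samples_per_ui]  # one sample per UI (folded eye)
    highs = [s for s in samples if s > threshold]
    lows = [s for s in samples if s <= threshold]
    return min(highs) - max(lows)


def relative_error(value, reference):
    """RE as the ratio of the absolute error to the reference value."""
    return abs(value - reference) / abs(reference)


# Four UIs sampled four times each (illustrative values, in volts).
wave = [0.0, 0.5, 0.875, 0.75,
        0.5, 0.25, 0.125, 0.25,
        0.5, 0.75, 1.0, 0.75,
        0.5, 0.25, 0.0, 0.0]
height = eye_height(wave, samples_per_ui=4, phase=2, threshold=0.5)
print(height, relative_error(height, 1.0))
```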

To validate the efficiency of SPIRAL, we compare the SI analysis runtime of SPIRAL with the commercial SPICE on two different serial link designs. Input data of various lengths is randomly generated with a data rate of 24GT/s, and the sampling step of the output signal is 0.1ps. The SI analysis by SPIRAL and the SPICE simulation are performed in the same environment for a fair comparison. The results are shown in Fig. 10. The simulation time of SPICE varies with the complexity of the design, while the runtime of SPIRAL remains almost the same. As the length of the input data increases, the simulation time of SPICE increases significantly. For $10^5$ UIs, SPICE takes 378s and 927s to simulate case 1 and case 2, respectively, while SPIRAL needs only 21s, which corresponds to an 18-44× speedup over SPICE and a calculation rate of up to 4600UI/s. Therefore, the proposed SPIRAL can greatly improve the efficiency of SI analysis.
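The quoted speedup range follows from the reported runtimes, as the short check below shows. Only the three runtimes come from the text; the variable names are illustrative.

```python
# Runtimes reported in the text for the longest input-data length.
spice_runtime_s = {"case 1": 378.0, "case 2": 927.0}
spiral_runtime_s = 21.0

# Speedup is the ratio of SPICE runtime to SPIRAL runtime for each case.
speedup = {case: t / spiral_runtime_s for case, t in spice_runtime_s.items()}
for case, factor in speedup.items():
    print(f"{case}: {factor:.1f}x")
```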



Fig. 10. Runtime comparison between the proposed SPIRAL and SPICE.

## V. CONCLUSIONS

This paper introduces SPIRAL, a signal-power integrity co-analysis framework designed specifically for high-speed inter-chiplet serial links validation. The framework utilizes equivalent models based on ANNs and S-parameters to capture the PHY-level details of the TX, channel, and RX. By leveraging the pulse response obtained from these equivalent models, SPIRAL enables efficient signal-power integrity co-analysis. Experimental results show that SPIRAL can achieve high accuracy and efficiency compared to a commercial SPICE.

## ACKNOWLEDGMENT

This work was supported in part by the National Natural Science Foundation of China (Grant No. 62034007, 61974133, and 62141404) and the SGC Cooperation Project (Grant No. M-0612).

## REFERENCES

- [1] M. D. Rotaru *et al.*, "Design and development of high density fan-out wafer level package (HD-FOWLP) for deep neural network (DNN) chiplet accelerators using advanced interface bus (AIB)," in *Proc. ECTC*, 2021, pp. 1258–1263.
- [2] M. P. C. Mok *et al.*, "Chiplet-based system-on-chip for edge artificial intelligence," in *Proc. EDTM*, 2021, pp. 1–3.
- [3] X. Duan *et al.*, "Research on double-layer networks-on-chip for inter-chiplet data switching on active interposers," in *Proc. ICEPT*, 2021, pp. 1–6.
- [4] D. D. Sharma *et al.*, "Universal chiplet interconnect express (UCIe): An open industry standard for innovations with chiplets at package level," *IEEE TCPMT*, vol. 12, no. 9, pp. 1423–1431, 2022.
- [5] S. Park *et al.*, "Chip/package co-design analysis of advanced D2D interface using a statistical link simulator," in *Proc. ECTC*, 2021, pp. 1844–1849.
- [6] H. Park *et al.*, "Design flow for active interposer-based 2.5-D ICs and study of RISC-V architecture with secure NoC," *IEEE TCPMT*, vol. 10, no. 12, pp. 2047–2060, 2020.
- [7] C. Gu, "Challenges in post-silicon validation of high-speed I/O links," in *Proc. ICCAD*, 2012, pp. 547–550.
- [8] X. Zeng *et al.*, "High-speed link verification based on statistical inference," in *Proc. ISCAS*, 2016, pp. 906–909.
- [9] M. Abramovici, "In-system silicon validation and debug," *IEEE Des. Test. Comput.*, vol. 25, no. 3, pp. 216–223, 2008.
- [10] Hspice. [Online]. Available: https://www.synopsys.com/
- [11] L. T. Guan *et al.*, "FOWLP design for HBM applications," in *Proc. EPTC*, 2017, pp. 1–4.
- [12] Z. Wenle *et al.*, "Study of high speed interconnects of multiple dies stack structure with through-silicon-via (TSV)," in *Proc. EDAPS*, 2010, pp. 1–4.
- [13] M. A. Dolatsara *et al.*, "Worst-case eye analysis of high-speed channels based on Bayesian optimization," *IEEE TEMC*, vol. 63, no. 1, pp. 246–258, 2021.
- [14] M. Ahmed *et al.*, "Bunch of wires interface PHY design for multi-chiplet systems," in *Proc. EPTC*, 2021, pp. 395–398.
- [15] L. Kangrong *et al.*, "Frequency domain methodology for evaluating signal integrity performance of logic to logic and HBM interconnect models for chiplet packaging," in *Proc. EPTC*, 2020, pp. 147–151.
- [16] M. Miao *et al.*, "Co-design and signal-power integrity/EMI co-analysis of a switchable high-speed inter-chiplet serial link on an active interposer," in *Proc. ECTC*, 2022, pp. 1329–1336.
- [17] B. Mutnury *et al.*, "Macromodeling of nonlinear transistor-level receiver circuits," *IEEE TADVP*, vol. 29, no. 1, pp. 55–66, 2006.
- [18] S. A. Sadrossadat *et al.*, "Nonlinear electronic/photonic component modeling using adjoint state-space dynamic neural network technique," *IEEE TCPMT*, vol. 5, no. 11, pp. 1679–1693, 2015.
- [19] M. Moradi A. *et al.*, "Long short-term memory neural networks for modeling nonlinear electronic components," *IEEE TCPMT*, vol. 11, no. 5, pp. 840–847, 2021.
- [20] D. Jiao *et al.*, "Method for accurate and efficient signaling analysis of nonlinear circuits," in *Proc. EMCSI*, 2015, pp. 123–127.